THE COALESCENT EFFECTIVE SIZE OF AGE-STRUCTURED POPULATIONS

By Serik Sagitov and Peter Jagers 

Chalmers University of Technology 

We establish convergence to the Kingman coalescent for a class 
of age-structured population models with time-constant population 
size. Time is discrete with unit called a year. Offspring numbers in a 
year may depend on mother's age. 

1. Introduction. The well-known coalescent process describes how family lines merge in a sample from a large population, when time is traced backward. It is a continuous time Markov chain which keeps record of branches starting from $n$ leaves and going through $n - 1$ pairwise mergers toward the root of a so-called ultrametric tree. The number of branches is reduced from $k$, $2 \le k \le n$, to $k - 1$ at the rate $\binom{k}{2}$. At each reduction, a random pair of branches is replaced by a single branch.

Initially [9], the coalescent was obtained as an approximation of the genealogical tree of the Wright-Fisher model (WFM) with a large population size $N$ and the unit of coalescent time corresponding to $N$ nonoverlapping generations. Several papers (see [7, 8, 10, 12, 15], as well as Section 4 for an overview) have shown that the coalescent approximation applies more generally. Then the unit of coalescent time turns into $\sim N/c$ generations for some positive constant $c$ determined by the particular features of the population model under consideration. Following [15], we call $N_e = N/c$ the coalescent effective size (CES) of such a population model.

In the terminology of [17], the existence of a CES in the context of pop- 
ulation genetic data is equivalent to a situation in which it is not possible 
to reject the basic WFM. This means that the coalescence pattern is indistinguishable from that of a WFM. The CES is a narrower version of the classical genetical concept of an inbreeding effective size (IES) designed to compare the rate of genetic drift in a given model with that of the WFM (see [5]). The existence of a CES implies that the IES exists as well and takes the same value. The reverse is not true: an IES may exist while a CES is absent (as in the case of convergence to a coalescent with multiple mergers [16]).

This paper looks for a general CES formula in the case of overlapping generations. The best known constant size genetic population model with overlapping generations is the Moran model, assuming that each unit of time one individual is killed and another produces an offspring, so that the population size $N$ remains constant over time. It is straightforward to verify that in this case the coalescent approximation holds with the coalescent time unit equal to $N^2/2$ units of the Moran model time. Since the generation time in the Moran model is $N$, this implies the existence of a CES with $N_e \sim N/2$. Here and elsewhere, $x_N \sim y_N$ means that $x_N/y_N \to 1$ as $N \to \infty$.

An age-structured version of the WFM, introduced in [4], discerns among $A$ age groups of constant sizes $N(a) = q(a)N$, $a = 1, 2, \dots, A$. In contrast to the basic WFM, the full backward description of an age-structured ancestral process should include age-labelling of lineages. In terms of the probability $p(a)$ that a randomly chosen new-born individual has a parent of age $1 \le a \le A$, $p(1) + \dots + p(A) = 1$, the lineage back of an individual in the age-structured WFM exhibits ages with the transition probabilities

$$
\begin{pmatrix}
p(1) & p(2) & p(3) & \cdots & p(A-1) & p(A) \\
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix},
\tag{1}
$$

the subdiagonal ones of course mirroring that an $(a+1)$-aged individual is viewed as stemming from an $a$-aged the preceding year, namely, herself. It is easy to verify that

$$\boldsymbol{\gamma} = (\gamma(1), \dots, \gamma(A)),$$

with

$$\gamma(a) = \frac{1}{\gamma}\bigl(p(a) + \dots + p(A)\bigr), \qquad \gamma = \sum_{a=1}^{A} a\,p(a), \tag{2}$$

is the corresponding stationary distribution of ages (in analogy with the stable age distribution of branching processes or deterministic age-structured population models, cf. [6]). Clearly, $\gamma(1) = 1/\gamma$.
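The stationarity claim can be checked numerically. The following is a minimal sketch, assuming a hypothetical childbearing-age distribution with three age classes; the helper names `stationary_ages` and `step_back` are illustrative and not part of the paper.

```python
from fractions import Fraction

def stationary_ages(p):
    """Return (gamma, [gamma(1), ..., gamma(A)]) as defined in (2)."""
    gamma = sum(a * pa for a, pa in enumerate(p, start=1))
    return gamma, [sum(p[a - 1:]) / gamma for a in range(1, len(p) + 1)]

def step_back(dist, p):
    """Push an age distribution one year back through the matrix (1)."""
    A = len(p)
    # A lineage of age 1 jumps to age a with probability p(a);
    # a lineage of age a+1 moves to age a with probability one.
    return [dist[0] * p[a] + (dist[a + 1] if a + 1 < A else 0) for a in range(A)]

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # hypothetical p(1), p(2), p(3)
gamma, g = stationary_ages(p)
```

With exact rational arithmetic, `step_back(g, p)` returns `g` again, and `g[0]` equals `1 / gamma`.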



According to [4], the IES of the age-structured WFM is $\sim N/(c_{\mathrm{age}}\gamma)$, where

$$c_{\mathrm{age}} = \sum_{a=1}^{A} \gamma^2(a)\left(\frac{1}{q(a)} - \frac{1}{q(a-1)}\right) \tag{3}$$

under the convention $1/q(0) = 0$. This IES formula takes into account that the generation time of the age-structured WFM is $\gamma$. In the particular case of constant fertility across ages, $p(a) = q(a)$. Hence, with $\gamma(A+1) = 0$,

$$c_{\mathrm{age}} = \sum_{a=1}^{A} \frac{\gamma^2(a) - \gamma^2(a+1)}{q(a)} = \sum_{a=1}^{A} \frac{(q(a)/\gamma)\bigl(2\gamma(a) - q(a)/\gamma\bigr)}{q(a)} = \frac{1}{\gamma}\left(2 - \frac{1}{\gamma}\right),$$

and the effective population size becomes $\sim N/(2 - \gamma^{-1})$. This, in turn, transforms to the formula for the Moran model as $\gamma \to \infty$, though the Moran model certainly has no fixed age distribution.
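As an illustration of the constant-fertility case, here is a small sketch that evaluates $c_{\mathrm{age}}$ in the form $\sum_a (\gamma^2(a) - \gamma^2(a+1))/q(a)$ with $\gamma(A+1) = 0$ and compares the resulting ratio $N_e/N$ with $1/(2 - \gamma^{-1})$. The age distribution is hypothetical and the function name `c_age` is illustrative.

```python
from fractions import Fraction

def c_age(p, q):
    """Return (gamma, sum over a of (gamma(a)^2 - gamma(a+1)^2) / q(a))."""
    A = len(p)
    gamma = sum(a * pa for a, pa in enumerate(p, start=1))
    g = [sum(p[a:]) / gamma for a in range(A)] + [0]   # gamma(A+1) = 0
    return gamma, sum((g[a] ** 2 - g[a + 1] ** 2) / q[a] for a in range(A))

q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # hypothetical age distribution
gamma, c = c_age(q, q)     # constant fertility: p(a) = q(a)
ratio = 1 / (c * gamma)    # asymptotic N_e / N
```

For these numbers $\gamma = 5/3$ and the ratio equals $5/7$, in agreement with $1/(2 - 3/5)$.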

Just like the classical WFM, the offspring number of an $a$-aged individual in a large ($N \to \infty$) age-structured WFM is asymptotically Poisson with the mean

$$m(a) = p(a)q(1)/q(a). \tag{4}$$

In Section 2 we introduce an age-structured model allowing an arbitrary marginal reproduction law compatible with the constant population size assumption. The subject of our interest, the ancestral process of the age-structured population, is described at the end of Section 2. Our coalescent approximation result, Theorem 3.1, stated in Section 3, gives a CES formula for populations with exchangeable reproduction, which extends the CES formula $\sim N/(c_{\mathrm{age}}\gamma)$ for the age-structured WFM.

In Section 4 we interpret our CES formula in terms of earlier known formulae for geographically-structured WFMs with strong migration and for exchangeable populations with rapidly fluctuating sizes. Special attention is paid to the question whether $N_e$ is smaller than $N$. The final part of the paper is devoted to the proof of Theorem 3.1.



2. An age-structured population model. Time is considered discrete with a unit to be called a year, for convenience. Let $a = 1, \dots, A$ stand for the age of an individual, where $A < \infty$ is the maximal possible age. Each year the population has the same size $N$ and the age-composition also remains fixed,

$$\mathbf{N} = (N(1), \dots, N(A)),$$

so that

$$N = N(1) + \dots + N(A), \qquad N(1) \ge \dots \ge N(A).$$

Individuals are assumed similar/exchangeable at least in the weak sense 
that all individuals of the same age have the same probabilities of surviving 
a year and have the same offspring number distribution. By the assumption 
of a fixed age structure, individuals can certainly not be independent of each 
other: if I survive, your chances diminish, and, similarly, if you have many 
kids a year, I tend to have few. But in some sense individuals should be 
interchangeable and we shall impose more of proper exchangeability, where 
needed. It is a good idea to visualize entities like the different individuals' 
lifespans or reproductions at various ages as exchangeable throughout, even 
though this is not always needed for the results. 

The age structure determines the age distribution in the population through $q(a,N) := N(a)/N$, and even the life span distribution: the probability of surviving year $a$, given that you have survived the preceding year, is

$$\frac{N(a+1)}{N(a)} = \frac{q(a+1,N)}{q(a,N)}.$$

The survival function is the same for all individuals and determined by the products of yearly survival probabilities. If $L$ denotes individual life span, thus

$$P(L \ge a) = \frac{N(a)}{N(1)} = \frac{q(a,N)}{q(1,N)}, \qquad a = 1, \dots, A,$$

and

$$E(L) = \sum_{a=1}^{A} P(L \ge a) = \frac{1}{q(1,N)}.$$

We assume that

$$q(a,N) \to q(a) > 0, \qquad N \to \infty. \tag{5}$$

The vector of parameters $\mathbf{q} = (q(1), \dots, q(A))$ then also describes the asymptotic life span distribution in large populations,

$$P(L \ge a) = \frac{q(a)}{q(1)}, \qquad a = 1, \dots, A,$$

and

$$E(L) = \frac{1}{q(1)}.$$
Denote the one-year offspring numbers of the $a$-aged individuals by $\{\nu_i(a)\}_{i=1}^{N(a)}$, for $a = 1, \dots, A$, assumed to be independent across age classes and exchangeable within them. Dropping the $i$-suffix for simplicity, we write $m(a,N) := E(\nu(a))$, and require that

$$\sum_{i=1}^{N(a)} \nu_i(a) = N(a)\,m(a,N), \qquad 1 \le a \le A, \tag{6}$$

so that

$$N(1) = \sum_{a=1}^{A} N(a)\,m(a,N).$$

Again, assume that there is convergence

$$m(a,N) \to m(a), \qquad N \to \infty, \tag{7}$$

and, hence, that

$$q(1) = \sum_{a=1}^{A} q(a)\,m(a).$$

This means that the $p(a) := \frac{q(a)}{q(1)}\,m(a)$, $a = 1, 2, \dots, A$, sum to one and give the asymptotic, so-called stable, distribution of age at childbearing in a critical population (cf. [6], Section 8.4), that is, one where the mean offspring number per individual equals one. Indeed, for fixed $N$, the expected yearly number of children of $a$-aged mothers is $N(a)\,m(a,N)$. Since the total number of children born in a year is $N(1)$, the distribution of age at childbearing is

$$\frac{N(a)\,m(a,N)}{N(1)} = \frac{q(a,N)\,m(a,N)}{q(1,N)}$$

and

$$E\bigl(\nu(1) + \dots + \nu(L)\bigr) = E\left(\sum_{l=1}^{A}\sum_{a=1}^{l} \nu(a)\,1_{\{L=l\}}\right) = \sum_{a=1}^{A} m(a,N)\,P(L \ge a) = \sum_{a=1}^{A} m(a,N)\,\frac{q(a,N)}{q(1,N)} = 1,$$

since $m(a,N)$ is precisely the expected offspring number one year of a surviving $a$-aged individual. Of course, matters cannot possibly stand otherwise when population size remains constant.
Fig. 1. Forward picture.



Having clarified the prospective view, we look backward, at the genealogy of $n$ individuals sampled out of the population the present year. Let $Z_0(a)$ denote the number of individuals of age $a$ among them, so that $Z_0 := Z_0(1) + \dots + Z_0(A) = n$. The vector $\mathbf{Z}_0 = (Z_0(1), \dots, Z_0(A))$ is the initial state of a Markov chain called the ancestral process. Its state at time $r$ is given by the numbers $\mathbf{Z}_r = (Z_r(1), \dots, Z_r(A))$ of ancestors of the sampled individuals $r$ years ago, sorted by age. The total number of ancestors will be denoted by $Z_r := Z_r(1) + \dots + Z_r(A)$. The Markov chain is time homogeneous and has a finite number of states. The class of states, where there is a single individual in one of the age classes, is absorbing.

Now, consider a situation where $k$ lines merge into one $a$-aged individual. If all the $k$ individuals are newborns, whose mother was of age $a$, we talk of a merger of type one, and denote it by $I(a,k)$. This situation can only occur if the mother is not in the sample. If she is, the $k$ merging lines will be those of $k - 1$ newborns and that from herself, one year later when she has attained age $a + 1$. Such mergers are said to be of type two and will be denoted $II(a,k)$.

Example. We illustrate the forward and backward views of this age-structured population model by two figures dealing with the case $N = 16$, $A = 4$, $N(1) = 6$, $N(2) = 4$, $N(3) = 3$, $N(4) = 3$. Figure 1 shows the development one year forward in time, the arrows indicating parent-offspring relationships and aging. The nonzero offspring sizes are $\nu_2(1) = \nu_4(1) = \nu_2(4) = \nu_3(4) = 1$; $\nu_1(2) = 2$.

Figure 2 presents the retrospective view of the model by tracing $n = 8$ ancestral lines one year back. We see one merger of type $I(2,2)$ and one merger of type $II(1,2)$. Each of the two mergers reduces the number of branches by one.
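The forward quantities of Section 2 can be evaluated for this example. The sketch below is illustrative only: the per-class offspring totals `births` are an assumed reading of Figure 1 (two births each from mothers of ages 1, 2 and 4), chosen so that the births add up to $N(1) = 6$.

```python
from fractions import Fraction

N_age = [6, 4, 3, 3]     # N(1), ..., N(4) from the example, N = 16
births = [2, 2, 0, 2]    # offspring totals per mother age class (assumed reading of Fig. 1)

N = sum(N_age)
q = [Fraction(n, N) for n in N_age]                   # q(a, N) = N(a) / N
m = [Fraction(b, n) for b, n in zip(births, N_age)]   # m(a, N), from constraint (6)
p = [qa * ma / q[0] for qa, ma in zip(q, m)]          # distribution of age at childbearing
mean_life = 1 / q[0]                                  # E(L) = N / N(1)
lifetime_offspring = sum(ma * qa / q[0] for ma, qa in zip(m, q))
```

Here the age-at-childbearing distribution is $(1/3, 1/3, 0, 1/3)$, the mean life span is $8/3$ years, and the expected lifetime offspring number equals one, as it must when the population size is constant.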
3. Main result. A first assumption on asymptotics, that age distributions $q(a,N)$ and expected offspring numbers $m(a,N)$ should converge as $N \to \infty$, has already been mentioned. In addition, we require that offspring variances stabilize,

$$E(\nu^2(a)) \to m(a) + V(a), \qquad N \to \infty, \quad 1 \le a \le A, \tag{8}$$

and that third moments do not grow too quickly,

$$E(\nu^3(a)) = o(N), \qquad N \to \infty, \quad 1 \le a \le A. \tag{9}$$

Here the limits $V(a) = \lim_{N\to\infty} E(\nu(a)(\nu(a) - 1))$ are never negative, since $\nu(a)(\nu(a) - 1) \ge 0$. [The case $V(a) = 0$ is not necessarily without interest, when there is an age-structure, since in this case individuals either may have given birth or not, $\nu(a) = 0$ or 1.] Whereas (8) serves to ensure that the time scale leading to the coalescent approximation is $T_N = N$, condition (9) is aimed at prohibiting multiple mergers of ancestral lines (cf. [16]).

Theorem 3.1. Assume (5), (6), (7), (8) and (9), with (6) interpreted to include the age-wise exchangeability and independence across ages mentioned before the equation itself.

Then the weak convergence to the Kingman coalescent 



(10) 

holds with 

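$$\lambda = \frac{2}{\gamma}\sum_{a=1}^{A} \frac{p(a)\gamma(a+1)}{q(a)} + \frac{1}{\gamma^2 q^2(1)}\sum_{a=1}^{A} V(a)q(a), \tag{11}$$

implying that a CES exists and satisfies

$$N_e \sim N/(\lambda\gamma), \qquad N \to \infty. \tag{12}$$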



Fig. 2. Backward picture.
Our proof of (11) relies on asymptotics of joint factorial moments of offspring numbers within an age class. It follows from Section 6 of [7] that the lower moments satisfy

$$E\bigl(\nu_1(a)\cdots\nu_j(a)\bigr) \to m^j(a) \tag{13}$$

and

$$E\bigl(\nu_1^{(2)}(a)\,\nu_2(a)\cdots\nu_j(a)\bigr) \to V(a)\,m^{j-1}(a), \tag{14}$$

while the higher factorial moments are bounded by

$$E\bigl(\nu_1^{(x_1)}(a)\cdots\nu_j^{(x_j)}(a)\bigr) = o(N^{\delta-1}) \quad \text{if } \delta := x_1 + \dots + x_j - j \ge 2, \tag{15}$$

where $\nu^{(x)} := \nu(\nu - 1)\cdots(\nu - x + 1)$ denotes the descending factorial power. These relations amount to a sort of asymptotic independence of offspring numbers within age classes.

The result continues to hold if condition (6) is replaced by what is again a form of asymptotic independence, now between offspring numbers across age groups: for any natural $j_1, \dots, j_A$; $k_{11}, \dots, k_{1j_1}$; $\dots$; $k_{A1}, \dots, k_{Aj_A}$,

$$E\left(\prod_{a=1}^{A}\prod_{l=1}^{j_a} \nu_l^{k_{al}}(a)\right) \approx \prod_{a=1}^{A} E\left(\prod_{l=1}^{j_a} \nu_l^{k_{al}}(a)\right). \tag{16}$$

Here and elsewhere, $x_N \approx y_N$ means $x_N = y_N + o(1)$ and a product from one to zero equals one.

Though not necessary, the natural interpretation of the above setup is that, as $N \to \infty$, offspring numbers converge in distribution and in $L^p$ for any $p > 0$. The limiting random variables $\nu_l(a)$ will then satisfy (16) with equality. If they are bounded, it follows by the Weierstrass approximation theorem that joint distribution functions factorize analogously. Hence, in the limit, reproduction of parents in different age classes is independent.

Such an extended version of Theorem 3.1 implies the existence of the CES of the age-structured WFM introduced in [4]. Indeed, the reproduction law of the age-structured WFM is given by the multinomial distribution

$$\mathrm{Mn}\bigl(N(1);\ \underbrace{\phi(1), \dots, \phi(1)}_{N(1)\text{ times}},\ \dots,\ \underbrace{\phi(A), \dots, \phi(A)}_{N(A)\text{ times}}\bigr),$$

where $\phi(a) = p(a)/N(a)$. Taking partial derivatives of the joint generating function

$$E\left(\prod_{a=1}^{A}\prod_{l=1}^{N(a)} s_{al}^{\nu_l(a)}\right) = \left(\sum_{a=1}^{A}\sum_{l=1}^{N(a)} \phi(a)\,s_{al}\right)^{N(1)},$$

we obtain

$$E\left(\prod_{a=1}^{A}\prod_{l=1}^{N(a)} \nu_l(a)^{(k_{al})}\right) \sim \prod_{a=1}^{A}\prod_{l=1}^{N(a)} \bigl(N(1)\phi(a)\bigr)^{k_{al}} \sim \prod_{a=1}^{A}\prod_{l=1}^{N(a)} E\bigl(\nu_l(a)^{(k_{al})}\bigr).$$

Thus, the offspring numbers are asymptotically independent both across and within different age classes, ensuring (16).

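To see that $\lambda = c_{\mathrm{age}}$, as defined by (3), observe that exactly as in the argument following the definition,

$$c_{\mathrm{age}} = \sum_{a=1}^{A} \frac{1}{q(a)}\bigl(\gamma^2(a) - \gamma^2(a+1)\bigr) = \frac{1}{\gamma}\sum_{a=1}^{A} \frac{p(a)}{q(a)}\bigl(\gamma(a) + \gamma(a+1)\bigr) = \frac{2}{\gamma}\sum_{a=1}^{A} \frac{p(a)\gamma(a+1)}{q(a)} + \frac{1}{\gamma^2}\sum_{a=1}^{A} \frac{p^2(a)}{q(a)}. \tag{17}$$

On the other hand, if the marginal distribution of the offspring number is asymptotically Poisson with mean (4), then $V(a) = (p(a)q(1)/q(a))^2$, turning the second term of (11) into that of the last expression above, so that $\lambda = c_{\mathrm{age}}$.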
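The agreement between the coalescent rate of Theorem 3.1 and the WFM constant can be checked with exact arithmetic. This is a minimal sketch under stated assumptions: the rate is taken as $\lambda = \frac{2}{\gamma}\sum_a p(a)\gamma(a+1)/q(a) + \frac{1}{\gamma^2 q^2(1)}\sum_a V(a)q(a)$, the WFM constant as $\sum_a (\gamma^2(a) - \gamma^2(a+1))/q(a)$, and the vectors `p` and `q` are hypothetical.

```python
from fractions import Fraction

def gammas(p):
    """Return gamma and [gamma(1), ..., gamma(A), gamma(A+1) = 0]."""
    gamma = sum(a * pa for a, pa in enumerate(p, start=1))
    return gamma, [sum(p[a:]) / gamma for a in range(len(p))] + [0]

def lam(p, q, V):
    """Coalescent rate: type II merger term plus type I merger term."""
    gamma, g = gammas(p)
    A = len(p)
    type_two = 2 / gamma * sum(p[a] * g[a + 1] / q[a] for a in range(A))
    type_one = sum(V[a] * q[a] for a in range(A)) / (gamma * q[0]) ** 2
    return type_two + type_one

def c_age(p, q):
    """WFM constant: sum of (gamma(a)^2 - gamma(a+1)^2) / q(a)."""
    gamma, g = gammas(p)
    return sum((g[a] ** 2 - g[a + 1] ** 2) / q[a] for a in range(len(p)))

q = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # hypothetical q(a), nonincreasing
p = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]   # hypothetical p(a)
m = [pa * q[0] / qa for pa, qa in zip(p, q)]            # mean offspring numbers, as in (4)
V_poisson = [ma ** 2 for ma in m]                       # E(nu(nu-1)) of a Poisson(m) law
```

For this input both `lam(p, q, V_poisson)` and `c_age(p, q)` evaluate to $901/1323$.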

4. The coalescent rates $\lambda$, $c_{\mathrm{age}}$, $c_{\mathrm{geo}}$ and $c_{\mathrm{dem}}$. The coalescent rate parameter $\lambda$ in (11) is a sum of two terms. From the derivation in Section 8, the first of these corresponds to a $II(a,2)$ merger, the second to one of type $I(a,2)$. Notice that the second term disappears in the case $V(a) = 0$, with at most one offspring possible at the age $a$. In this section we interpret the two terms using the known CES formulae for the population models with 1. exchangeable reproduction, 2. strong migration and 3. fast size fluctuation.

The exchangeable haploid population model of [1, 2] is a flexible extension of the basic WFM, allowing an arbitrary marginal distribution of the offspring number $\nu$. According to [10], the CES of the exchangeable population satisfies $N_e \sim N/V_{\mathrm{hap}}$, where $V_{\mathrm{hap}} = E(\nu(\nu - 1))$ is the variance of the offspring number (the WFM corresponds to the symmetrical multinomial reproduction law with $V_{\mathrm{hap}} \approx 1$). For the haploid model, $V_{\mathrm{hap}}$ can be arbitrarily close to zero, so that no upper bound on $N_e$ is obtained. In [13] the last result was extended to diploid exchangeable models with random mating. If the haploid size of the diploid population is $N$, then it is shown that $N_e \sim 4N/V_{\mathrm{dip}}$, where again $V_{\mathrm{dip}} = E(\nu(\nu - 1))$ but now $\nu$ represents the number of diploid offspring to one couple. In this case $V_{\mathrm{dip}} \ge 2$ and $N_e \le 2N$, with the upper bound reached when one couple produces exactly two children.
An important case of CES due to fast size fluctuations is considered in [7], where the demographic fluctuations backward in time occur according to a stationary Markov chain with the possible values $N(a) = q(a)N$, $a = 1, 2, \dots, A$, the transition probabilities are $b_{ij}$, $i, j = 1, 2, \dots, A$, and the stationary distribution is $(\gamma(1), \dots, \gamma(A))$. For exchangeable reproduction, it is shown that $N_e \sim N/c_{\mathrm{dem}}$ with

$$c_{\mathrm{dem}} = \sum_{i=1}^{A}\sum_{j=1}^{A} \gamma(i)\,b_{ij}\,V_{ij}\,q(j)\,q^{-2}(i), \tag{18}$$

where $V_{ij} = E_{ij}(\nu(\nu - 1))$ measures variation in offspring numbers when the offspring generation size is $q(i)N$ and the parent generation size is $q(j)N$. This formula can be read as follows: two ancestral lines merge during a $q(i)N \to q(j)N$ backward size change at the rate equal to the rate $\gamma(i)b_{ij}$ of the size change times the conditional merger rate $V_{ij}\,q(j)\,q^{-2}(i)$. In the particular case of the WFM with fast fluctuations, the CES is approximated by the harmonic average of actual sizes $\bigl(\sum_{i=1}^{A} \gamma(i)/(q(i)N)\bigr)^{-1}$, clearly smaller than the arithmetic average $N$ for nontrivial fluctuations.
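The harmonic-average reading of (18) can be illustrated as follows. This is a sketch under two labeled assumptions: for the WFM the offspring variation is taken as $V_{ij} = (q(i)/q(j))^2$ (the large-$N$ limit of $E(\nu(\nu-1))$ for binomial sampling), and the size chain is a hypothetical i.i.d. sequence with $b_{ij} = \gamma(j)$.

```python
from fractions import Fraction

def c_dem_wfm(stat, b, q):
    """Formula (18) with V_ij = (q(i)/q(j))**2, an assumed WFM limit."""
    A = len(q)
    return sum(stat[i] * b[i][j] * (q[i] / q[j]) ** 2 * q[j] / q[i] ** 2
               for i in range(A) for j in range(A))

stat = [Fraction(1, 2), Fraction(1, 2)]   # hypothetical stationary law of the size chain
b = [stat, stat]                          # i.i.d. sizes: b_ij = stat[j]
q = [Fraction(1, 2), Fraction(3, 2)]      # relative sizes with arithmetic mean 1

c_dem = c_dem_wfm(stat, b, q)
harmonic = 1 / sum(s / x for s, x in zip(stat, q))   # harmonic mean of the q(i)
```

Here `c_dem` equals $4/3$, so $N_e/N \approx 3/4$, which is the harmonic mean of the relative sizes and lies below their arithmetic mean 1.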

Formula (18) yields the following interpretation of the second term in (11) corresponding to a $I(a,2)$ merger. For two lines to merge at the age group $a$ as sister lines, they both should enter the age group $a$ immediately after visiting the age group 1, which happens at rate $(\gamma(1)b_{1a})^2 = (p(a)/\gamma)^2$. The corresponding conditional merger rate $V(a)q(a)/(m(a)q(a))^2$ is equal to that of a $m(a)q(a)N \to q(a)N$ backward size change. Multiplying these two rates and using (4) leads to the second term of (11).

To interpret the first term of (11) corresponding to a $II(a,2)$ merger, we turn to the geographically structured WFM which is a key example in [15] illustrating the concept of CES. In this model a haploid population of constant size $N$ is split into $A$ subpopulations of constant sizes $N(a) = q(a)N$, so that $q(1) + \dots + q(A) = 1$. Followed backward in time, an individual migrates from subpopulation $i$ to subpopulation $j$ with probability $b_{ij}$, and chooses its parent uniformly at random among $N(j)$ members of the parental subpopulation independently of other individuals. If migration is irreducible and aperiodic and $(\gamma(1), \dots, \gamma(A))$ is the stationary distribution of the backward migration process, then, according to [15], the CES of the structured WFM satisfies $N_e \sim N/c_{\mathrm{geo}}$, where

$$c_{\mathrm{geo}} = \sum_{a=1}^{A} \frac{1}{q(a)}\,\gamma^2(a). \tag{19}$$

As a formula for the inbreeding effective size (IES), (19) was discussed in [14], where it was pointed out that $c_{\mathrm{geo}} \ge 1$, with the equality $c_{\mathrm{geo}} = 1$ holding only if $\gamma(a) = q(a)$ for all $a$. The meaning of (19) is clear: for two lines to merge at the subpopulation $a$, they should be there at the same generation [rate $\gamma^2(a)$] and choose the same parent [rate $1/q(a)$].
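A short numerical sketch of the inequality $c_{\mathrm{geo}} \ge 1$ follows. The vectors are hypothetical, and the function name `c_geo` is illustrative.

```python
from fractions import Fraction

def c_geo(gam, q):
    """Formula (19): sum over a of gamma(a)^2 / q(a)."""
    return sum(g * g / x for g, x in zip(gam, q))

q = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]       # hypothetical subpopulation shares
skewed = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]  # hypothetical stationary law
```

With the stationary law equal to `q` the constant is exactly 1, while the mismatched law `skewed` gives $163/100$, so the effective size drops below $N$.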

The age-structured WFM is very similar to the geographically structured WFM with the transition probabilities $\|b_{ij}\|$ given by (1). However, (19) does not directly apply to the age-structured WFM, since individuals migrating backward in time from age group $a$ to age group $a - 1$ sample their parents without replacement (thereby violating the assumption of independent choice of parents). Still, (19) helps in interpreting the first term of (11) corresponding to a $II(a,2)$ merger. For two lines to merge at the age group $a$ as nonsister lines, they must enter the age group $a$ along different routes. One of the lines visits the age group $a$ immediately after visiting age group 1, which happens at rate $p(a)/\gamma$, while the other arrives through the age group $a + 1$, which happens at rate $\gamma(a + 1)$. [Notice that (19) also simplifies understanding of the last term in (17).]



Fig. 3. Observed (solid line) and stationary (dashed line) age group sizes of Swedish female population versus age.
5. Effective versus actual population size. Recall that $q(1) \ge q(2) \ge \dots \ge q(A)$ and $\gamma(1) = 1/\gamma$ to see that the first equality in (17) implies that

$$\gamma c_{\mathrm{age}} \ge \frac{\gamma}{q(1)}\sum_{a=1}^{A}\bigl(\gamma^2(a) - \gamma^2(a+1)\bigr) = \frac{\gamma\,\gamma^2(1)}{q(1)} = \frac{1}{\gamma q(1)}. \tag{20}$$

Therefore, the CES of an age-structured WFM has the upper bound $N_e \le \gamma q(1)N$, where the constant $\gamma q(1) = \gamma/E[L]$ is the ratio between the average age at child-bearing and the average life length. Similarly, since

$$\frac{1}{\gamma} = \gamma\sum_{a=1}^{A}\bigl(\gamma^2(a) - \gamma^2(a+1)\bigr) = \sum_{a=1}^{A} p(a)\left(\frac{p(a)}{\gamma} + 2\gamma(a+1)\right), \tag{21}$$

we have

$$2\sum_{a=1}^{A} p(a)\gamma(a+1) = \frac{1}{\gamma}\left(1 - \sum_{a=1}^{A} p^2(a)\right),$$

and we get a weaker upper bound

$$N_e \le \gamma q(1)\left(1 - \sum_{a=1}^{A} p^2(a)\right)^{-1} N \tag{22}$$

for the age-structured model with exchangeable reproduction.
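The two upper bounds can be checked on a toy age structure. The sketch below uses hypothetical vectors `p` and `q` (with `q` nonincreasing) and assumes the bounds in the forms $N_e/N \le \gamma q(1)$ for the age-structured WFM and $N_e/N \le \gamma q(1)/(1 - \sum_a p^2(a))$ for the general model.

```python
from fractions import Fraction

q = [Fraction(3, 10), Fraction(3, 10), Fraction(1, 5), Fraction(1, 5)]   # hypothetical q(a)
p = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 10), Fraction(1, 10)]  # hypothetical p(a)
A = len(p)

gamma = sum(a * pa for a, pa in enumerate(p, start=1))
g = [sum(p[a:]) / gamma for a in range(A)] + [0]     # gamma(1..A) and gamma(A+1) = 0

# Identity used between (21) and (22).
lhs = 2 * sum(p[a] * g[a + 1] for a in range(A))
rhs = (1 - sum(pa ** 2 for pa in p)) / gamma

c_age = sum((g[a] ** 2 - g[a + 1] ** 2) / q[a] for a in range(A))
lam_min = 2 / gamma * sum(p[a] * g[a + 1] / q[a] for a in range(A))  # rate with V(a) = 0

ratio_wfm = 1 / (c_age * gamma)      # N_e / N for the age-structured WFM
ratio_max = 1 / (lam_min * gamma)    # N_e / N when all V(a) vanish
bound_20 = gamma * q[0]
bound_22 = gamma * q[0] / (1 - sum(pa ** 2 for pa in p))
```

Since any $V(a) \ge 0$ only increases the rate, `ratio_max` is the largest ratio the general model can attain for these `p` and `q`, and it stays below `bound_22`.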

These upper bounds could serve as fair estimates of CES in human and similar populations. For an illustration, we turn to Swedish official statistics for 2002 [18], yielding Figures 3 and 4, and $A = 111$, $\gamma = 30.6022$, $1/q(1) = 82.6094$.

The CES from (12) with $V(a) = 0$ is $N_e = 0.3890N$. The age-structured WFM yields $N_e = 0.3677N$. (Cf. Felsenstein's $N_e = 0.34N$ for U.S. population data, [4].) From Figure 4, $q(a) \approx q(1)$ for those $a$ where $\gamma(a)$ is not too small. This implies approximate equality in (20) and the above CES being close to the upper bounds (22), $N_e = 0.3919N$, and (20), $N_e = 0.3707N$.

6. The transition probability. In this section we derive an asymptotic formula for the one step transition probability

$$\Pi_{\mathbf{u}\mathbf{v}} := P(\mathbf{Z}_r = \mathbf{v} \mid \mathbf{Z}_{r-1} = \mathbf{u})$$

that a group of $u$ individuals with age distribution $\mathbf{u} = (u(1), \dots, u(A))$ stems from a possibly smaller group of $v$ individuals from the previous year with age distribution $\mathbf{v} = (v(1), \dots, v(A))$. We treat the parentage of newborns [$u(1)$ individuals of age 1] at time $r - 1$ as balls to be allocated among $N$ boxes (potential mothers) at time $r$. Box $i$ in the age group $a$ contains a random number $\nu_i(a)$ of slots (see Figure 2), and each slot can accept one ball. The meaning of such an allocation is, of course, that one of the $u(1)$ individuals happens to be among the $\nu_i(a)$ children of the $i$th $a$-aged individual.
Fig. 4. Solid line: $p(a)$; dashed line: $q(a)$; versus age $a$.

Not newborn individuals, that is, of age $a + 1 \ge 2$ at time $r - 1$, just stem from themselves one year younger at time $r$, so $v(a) \ge u(a+1)$, writing $u(A+1) = 0$. Those remaining, $a(a) = v(a) - u(a+1)$, must then have given birth to newborns. Combinatorially, we divide the $N(a)$ potential predecessors into $u(a+1)$ marked and $N(a) - u(a+1)$ unmarked boxes, in which balls can be placed to signify that the individual is among the predecessors. In Figure 2, for example, boxes are individuals in the previous year: counting from the left, we mark boxes 4 and 5 in the age group 1, box 3 in the age group 2 and, finally, box 1 in the age group 3.

Given the allocation result $\mathbf{v}$, we have $a(a) = v(a) - u(a+1)$ unmarked boxes hosting at least one ball. We write

$$\mathbf{a} = (a(1), \dots, a(A)), \qquad u = \sum_{a=1}^{A} u(a), \qquad v = \sum_{a=1}^{A} v(a), \qquad a = \sum_{a=1}^{A} a(a),$$

and notice that $a = v - u + u(1)$.
The next step is to show that

$$\Pi_{\mathbf{u}\mathbf{v}} = \binom{N(1)}{u(1)}^{-1}\prod_{a=1}^{A}\binom{N(a) - u(a+1)}{a(a)} \times \sum_{X} E\left(\prod_{a=1}^{A}\prod_{l=1}^{v(a)} \binom{\nu_l(a)}{x_l(a)}\right), \tag{23}$$

where $X = (X(1), \dots, X(A))$ and the vector $X(a) = (x_1(a), \dots, x_{v(a)}(a))$ gives the numbers of balls in the $v(a)$ boxes. Numbers $x_l(a)$, with indices $1 \le l \le a(a)$, correspond to unmarked boxes, while the indices $a(a) + 1 \le l \le v(a)$ are meant for the marked boxes, so that summation in (23) is over all distinct arrays $X$ satisfying

$$\sum_{a=1}^{A}\sum_{l=1}^{v(a)} x_l(a) = u(1), \qquad x_l(a) \ge 1_{\{l \le a(a)\}}. \tag{24}$$

To illustrate the notation introduced in this section, we refer to Figure 2, which follows $u = 8$ ancestral lines one year back in the case $\mathbf{u} = (4, 2, 1, 1)$, $\mathbf{v} = (2, 2, 1, 1)$, $v = 6$. Here we have $\mathbf{a} = (0, 1, 0, 1)$ and $X(1) = (1, 0)$, $X(2) = (2, 0)$, $X(3) = (0)$, $X(4) = (1)$.
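The bookkeeping for the Figure 2 configuration can be replayed in a few lines. This is only an illustrative sketch; the variable names are not from the paper.

```python
u = (4, 2, 1, 1)                    # ages of the 8 sampled lines (Figure 2)
v = (2, 2, 1, 1)                    # ages of their 6 ancestors one year back
X = [(1, 0), (2, 0), (0,), (1,)]    # balls per chosen box, age group by age group

A = len(u)
u_next = u[1:] + (0,)                            # u(a+1), with u(A+1) = 0
a = tuple(v[b] - u_next[b] for b in range(A))    # unmarked boxes hosting a ball

balls = sum(sum(x) for x in X)                   # must equal u(1)
# The first a(b) boxes in age group b are unmarked and need at least one ball each.
unmarked_ok = all(x >= 1 for b in range(A) for x in X[b][:a[b]])
```

The computed vector `a` is `(0, 1, 0, 1)`, its total is $v - u + u(1) = 2$, and the four balls match the $u(1) = 4$ newborn lines.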

Proof of (23). First we calculate the transition probability $\Pi^{I}_{\mathbf{u}\mathbf{v}}$ when the set $I$ of $v$ boxes is fixed. Here $I = (I(1), \dots, I(A))$ and the vector $I(a) = (i_1(a), \dots, i_{v(a)}(a))$ lists the box positions taken from the set $\{1, \dots, N(a)\}$. Again, positions $i_l(a)$, with indices $1 \le l \le a(a)$, correspond to unmarked boxes, while the indices $a(a) + 1 \le l \le v(a)$ are meant for the marked boxes. [In the case of Figure 2 we have $I(1) = (4, 5)$, $I(2) = (1, 3)$, $I(3) = (1)$, $I(4) = (3)$.] According to the allocation rules, we have

$$\Pi^{I}_{\mathbf{u}\mathbf{v}} = \frac{\sum_{X} E\left(\prod_{a=1}^{A}\prod_{l=1}^{v(a)} \binom{\nu_{i_l(a)}(a)}{x_l(a)}\right)}{\binom{N(1)}{u(1)}\prod_{a=1}^{A-1}\binom{N(a)}{u(a+1)}}.$$

The expression becomes independent of $I$ due to the exchangeability assumption. It remains to multiply the last expression with the number $\prod_{a=1}^{A}\binom{N(a)}{u(a+1)}\binom{N(a) - u(a+1)}{a(a)}$ of ways to choose appropriate sets $I$ of hosting boxes and do some simple algebra to obtain (23).

If $v \le u - 2$, then (15) and (23) together imply that $\Pi_{\mathbf{u}\mathbf{v}} = o(N^{-1})$, meaning that multiple mergers of lineages are impossible in the limiting coalescent. □
7. Transitions with $v \ge u - 1$. We embark on a careful asymptotic analysis of the transition probability (23) with the most common transition type, when no ancestral lines merge: $v = u$. In this case $x_l(a) = 1_{\{l \le a(a)\}}$, which in accordance with (13) entails $\Pi_{\mathbf{u}\mathbf{v}} \to \Lambda_{\mathbf{u}\mathbf{v}}$, where

$$\Lambda_{\mathbf{u}\mathbf{v}} = \varphi(\mathbf{a})\,1_{\{\mathbf{u}\to\mathbf{v}\}}\,1_{\{v=u\}} \tag{25}$$

and $1_{\{\mathbf{u}\to\mathbf{v}\}} := 1_{\{v(1) \ge u(2), \dots, v(A-1) \ge u(A)\}}$. Here $\varphi(\mathbf{a}) = \binom{a}{a(1), \dots, a(A)}\prod_{b=1}^{A} p(b)^{a(b)}$ is a multinomial probability. The fact that the asymptotic transition probability takes the form of a multinomial distribution is easy to explain. As the set of $u$ ancestral lines traced one year back does not change cardinality, the following happens. For $a \ge 2$, individuals of age $a$ turn to individuals of age $a - 1$, and lines from individuals of age 1 go to ages according to the multinomial Bernoulli scheme governed by the stationary distribution $\mathbf{p} = (p(1), \dots, p(A))$ of individual's age at childbearing.

Let $\Lambda_k = \|\Lambda_{\mathbf{u}\mathbf{v}}\|_{\mathbf{u}: u=k,\ \mathbf{v}: v=k}$ be the transition matrix at the level of $k$ ancestral lines. For $k = 1$, if the states are ordered as $(1, 0, \dots, 0, 0), (0, 1, \dots, 0, 0), \dots, (0, 0, \dots, 0, 1)$, $\Lambda_1$ is given by (1) with the stationary distribution (2). It is intuitively clear that, for an arbitrary $k$, the stationary distribution of the Markov chain with transition matrix $\Lambda_k$ is given by the following multinomial distribution:

$$\pi_k(\mathbf{u}) = \binom{k}{u(1), \dots, u(A)}\prod_{a=1}^{A} \gamma(a)^{u(a)} \quad \text{for } \mathbf{u} \text{ with } u = k. \tag{26}$$

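Nevertheless, we give a formal proof of this fact using a computation technique which will be used later to produce less obvious results. Indeed, for any vector $\mathbf{v}$ with $v = k$, we have

$$\sum_{\mathbf{u}: u=k} \pi_k(\mathbf{u})\Lambda_{\mathbf{u}\mathbf{v}} = \sum_{\mathbf{u}: u=k} \binom{k}{u(1), \dots, u(A)}\prod_{a=1}^{A}\gamma(a)^{u(a)} \binom{u(1)}{v(1) - u(2), \dots, v(A-1) - u(A), v(A)}\prod_{a=1}^{A} p(a)^{v(a) - u(a+1)}\,1_{\{\mathbf{u}\to\mathbf{v}\}},$$

which equals

$$\binom{k}{v(1), \dots, v(A)}\prod_{a=1}^{A} p(a)^{v(a)} \sum_{\mathbf{u}: u=k}\ \prod_{a=2}^{A}\binom{v(a-1)}{u(a)}\left(\frac{\gamma(a)}{p(a-1)}\right)^{u(a)} \gamma^{-u(1)}\,1_{\{\mathbf{u}\to\mathbf{v}\}}.$$

Observe that the summation over $\mathbf{u}$ can be replaced by independent summations over the components $u(a)$, since

$$u(a) \le v(a-1) \le u(1) + u(a) = u - u(2) - \dots - u(a-1) - u(a+1) - \dots - u(A)$$

and the summation index $u(a)$ is free to run between zero and

$$\min\{v(a-1),\ u - u(2) - \dots - u(a-1)\} = v(a-1), \qquad a = 2, \dots, A.$$

Therefore, using $\gamma^{-u(1)} = \gamma^{-k}\prod_{a=2}^{A}\gamma^{u(a)}$, the last sum converts to

$$\gamma^{-k}\sum_{u(2)=0}^{v(1)}\cdots\sum_{u(A)=0}^{v(A-1)}\ \prod_{a=2}^{A}\binom{v(a-1)}{u(a)}\left(\frac{\gamma\,\gamma(a)}{p(a-1)}\right)^{u(a)} = \gamma^{-k}\prod_{a=1}^{A-1}\left(\frac{\gamma\,\gamma(a)}{p(a)}\right)^{v(a)},$$

where the equality follows from the binomial theorem and $p(a-1) + \gamma\gamma(a) = \gamma\gamma(a-1)$. Since $p(A) = \gamma\gamma(A)$, we conclude that

$$\sum_{\mathbf{u}: u=k} \pi_k(\mathbf{u})\Lambda_{\mathbf{u}\mathbf{v}} = \binom{k}{v(1), \dots, v(A)}\prod_{a=1}^{A}\gamma(a)^{v(a)} = \pi_k(\mathbf{v}).$$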
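The stationarity of the multinomial law can also be confirmed by brute force for small $k$ and $A$. The sketch below assumes the no-merger limit described in the text: lines of age $a+1$ move to age $a$, and the $u(1)$ newborn lines choose parental ages independently according to $p$. The distribution `p` is hypothetical and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial(counts, probs):
    """Multinomial probability of `counts` under cell probabilities `probs`."""
    coef = factorial(sum(counts))
    value = Fraction(1)
    for c, pr in zip(counts, probs):
        coef //= factorial(c)
        value *= pr ** c
    return coef * value

def states(k, A):
    """All age configurations of k lines over A age classes."""
    return [u for u in product(range(k + 1), repeat=A) if sum(u) == k]

def Lambda(u, v, p):
    """Limit probability of moving from u to v when no lines merge."""
    A = len(u)
    a = [v[b] - (u[b + 1] if b + 1 < A else 0) for b in range(A)]
    if min(a) < 0 or sum(a) != u[0]:
        return Fraction(0)
    return multinomial(a, p)

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # hypothetical p(a)
gamma = sum(a * pa for a, pa in enumerate(p, start=1))
g = [sum(p[a:]) / gamma for a in range(len(p))]        # stationary ages gamma(a)

k = 3
S = states(k, len(p))
pi = {u: multinomial(u, g) for u in S}                 # candidate stationary law
pushed = {v: sum(pi[u] * Lambda(u, v, p) for u in S) for v in S}
```

With exact fractions, `pushed` coincides with `pi` on all ten states.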

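Next we turn to the case $v = u - 1$ of exactly one pairwise merger and prove that $\Pi_{\mathbf{u}\mathbf{v}} \sim N^{-1}C_{\mathbf{u}\mathbf{v}}$, where

$$C_{\mathbf{u}\mathbf{v}} = \varphi(\mathbf{a})\,\frac{u(1)}{q(1)}\sum_{a=1}^{A}\left[m(a)u(a+1) + \frac{a(a)V(a)}{2m(a)}\right]1_{\{\mathbf{u}\to\mathbf{v}\}}\,1_{\{v=u-1\}}. \tag{27}$$

If $v = u - 1$, then $x_l(a) = 1_{\{l \le a(a)\}} + 1_{\{a=b,\ l=j\}}$ for some $b \in \{1, \dots, A\}$ and some $j \in \{1, \dots, v(b)\}$, so that summation over $X$ in (23) can be replaced by summation over $b$ and $j$. Note that indices $j \le a(b)$ correspond to a single $I(b,2)$ case, while indices $j > a(b)$ correspond to a single $II(b,2)$ merger. The corresponding components of (23) are computed with help of (13) and (14):

$$\lim_{N\to\infty} N\,\Pi^{(1)}_{\mathbf{u}\mathbf{v}} = \varphi(\mathbf{a})\,\frac{u(1)}{q(1)}\prod_{a=1}^{A} m(a)^{-a(a)} \times \sum_{b=1}^{A}\sum_{j=1}^{a(b)} \frac{V(b)}{2}\,m(b)^{a(b)-1}\prod_{a \ne b} m(a)^{a(a)} = \varphi(\mathbf{a})\,\frac{u(1)}{2q(1)}\sum_{b=1}^{A} \frac{a(b)V(b)}{m(b)}$$

and

$$\lim_{N\to\infty} N\,\Pi^{(2)}_{\mathbf{u}\mathbf{v}} = \varphi(\mathbf{a})\,\frac{u(1)}{q(1)}\prod_{a=1}^{A} m(a)^{-a(a)} \sum_{b=1}^{A} u(b+1)\prod_{a=1}^{A} m(a)^{a(a)}\,m(b).$$

These two parts put together confirm (27).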
8. Proof of weak convergence toward the coalescent. Theorem 3.1 can be established following the proof of weak convergence to the coalescent presented in Section 5 of [15]. The proof is based on Theorem 2.12 of [3] and the next lemma from [11].

Lemma 8.1 (Möhle's lemma). If $A$ is a stochastic matrix such that $P = \lim_{k\to\infty} A^k$ exists, then

$$\lim_{N\to\infty}\left(A + \frac{1}{N}C + o\!\left(\frac{1}{N}\right)\right)^{[Nt]} = P - I + e^{tG},$$

where $I$ is the identity matrix and $G := PCP$.

So far we have computed the transition matrix $\Pi := \|\Pi_{\mathbf{u}\mathbf{v}}\|$ decomposition

$$\Pi = \Lambda + \frac{1}{N}C + o\!\left(\frac{1}{N}\right)$$

with $\Lambda := \|\Lambda_{\mathbf{u}\mathbf{v}}\|$, $C := \|C_{\mathbf{u}\mathbf{v}}\|$. The only remaining calculation is to find the coalescence rate at level $k$ defined as

$$c_k := \sum_{\mathbf{u}: u=k} \pi_k(\mathbf{u})H(\mathbf{u}),$$

where $H(\mathbf{u})$ is the coalescence rate when the ancestor process is in configuration $\mathbf{u}$,

$$H(\mathbf{u}) = \sum_{\mathbf{v}: v=u-1} C_{\mathbf{u}\mathbf{v}}.$$

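According to (27) and (26),

$$c_k = \sum_{\mathbf{u}: u=k} \binom{k}{u(1), \dots, u(A)}\prod_{b=1}^{A}\gamma(b)^{u(b)}\ \frac{u(1)}{q(1)} \sum_{\mathbf{v}: v=k-1} \binom{u(1)-1}{a(1), \dots, a(A)}\prod_{b=1}^{A} p(b)^{v(b)-u(b+1)} \sum_{a=1}^{A}\left[m(a)u(a+1) + \frac{a(a)V(a)}{2m(a)}\right]1_{\{\mathbf{u}\to\mathbf{v}\}}.$$

After switching the order of summation over $\mathbf{v}$ and $\mathbf{u}$, and then regrouping the terms, we obtain

$$c_k = \frac{k}{q(1)} \sum_{\mathbf{v}: v=k-1} \binom{k-1}{v(1), \dots, v(A)}\prod_{b=1}^{A} p(b)^{v(b)} \sum_{\mathbf{u}: u=k} \gamma^{-u(1)}\prod_{b=2}^{A}\binom{v(b-1)}{u(b)}\left(\frac{\gamma(b)}{p(b-1)}\right)^{u(b)} \sum_{a=1}^{A}\left[m(a)u(a+1) + \frac{V(a)}{2m(a)}\bigl(v(a) - u(a+1)\bigr)\right]1_{\{\mathbf{u}\to\mathbf{v}\}}.$$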
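Opening the square brackets, we split the expression in two terms $c_k = c'_k + c''_k$. The first term equals (after putting the summation over $a$ in front of other sums and representing the summation over $\mathbf{u}$ as a multiple sum)

$$c'_k = \frac{k}{q(1)}\sum_{a=1}^{A-1} m(a) \sum_{\mathbf{v}: v=k-1} \binom{k-1}{v(1), \dots, v(A)}\prod_{b=1}^{A} p(b)^{v(b)} \times \sum_{u(2)=0}^{v(1)}\cdots\sum_{u(A)=0}^{v(A-1)} u(a+1)\prod_{b=2}^{A}\binom{v(b-1)}{u(b)}\left(\frac{\gamma(b)}{p(b-1)}\right)^{u(b)} \times \gamma^{-u(1)}\,1_{\{u(1) = k - u(2) - \dots - u(A)\}}.$$

Since

$$\sum_{i=0}^{n}\binom{n}{i}\left(\frac{\gamma(b)\gamma}{p(b-1)}\right)^{i} = \left(\frac{\gamma(b-1)\gamma}{p(b-1)}\right)^{n}, \tag{28}$$

we have

$$c'_k = \frac{k}{q(1)\gamma^{k}}\sum_{a=1}^{A-1} m(a) \sum_{\mathbf{v}: v=k-1} \binom{k-1}{v(1), \dots, v(A)}\prod_{b \ne a}\bigl(\gamma(b)\gamma\bigr)^{v(b)}\ p(a)^{v(a)} \times \sum_{u(a+1)=0}^{v(a)} u(a+1)\binom{v(a)}{u(a+1)}\left(\frac{\gamma(a+1)\gamma}{p(a)}\right)^{u(a+1)}.$$

Applying

$$\sum_{i=0}^{n} i\binom{n}{i}\left(\frac{\gamma(a+1)\gamma}{p(a)}\right)^{i} = n\left(\frac{\gamma(a+1)\gamma}{p(a)}\right)\left(\frac{\gamma(a)\gamma}{p(a)}\right)^{n-1}$$

to the last sum yields

$$c'_k = \frac{k(k-1)}{q(1)\gamma}\sum_{a=1}^{A-1} m(a)\gamma(a+1) \sum_{\mathbf{v}: v=k-1} \binom{k-2}{v(1), \dots, v(a-1), v(a)-1, v(a+1), \dots, v(A)}\prod_{b \ne a}\gamma(b)^{v(b)}\,\gamma(a)^{v(a)-1},$$

and the relation

$$\sum_{n_1 + \dots + n_A = n} \binom{n-1}{n_1, \dots, n_{a-1}, n_a - 1, n_{a+1}, \dots, n_A}\prod_{b \ne a}\gamma(b)^{n_b}\,\gamma(a)^{n_a - 1} = 1 \tag{29}$$

(terms with $n_a = 0$ being zero) implies

$$c'_k = \binom{k}{2}\frac{2}{q(1)\gamma}\sum_{a=1}^{A-1} m(a)\gamma(a+1).$$

The second term is calculated similarly. From

$$c''_k = \frac{k(k-1)}{2q(1)}\sum_{a=1}^{A}\frac{V(a)}{m(a)} \sum_{\mathbf{v}: v=k-1} \binom{k-2}{v(1), \dots, v(a-1), v(a)-1, v(a+1), \dots, v(A)}\prod_{b=1}^{A} p(b)^{v(b)} \times \sum_{u(2)=0}^{v(1)}\cdots\sum_{u(A)=0}^{v(A-1)} \binom{v(a)-1}{u(a+1)}\left(\frac{\gamma(a+1)}{p(a)}\right)^{u(a+1)}\prod_{b \ge 2,\ b \ne a+1}\binom{v(b-1)}{u(b)}\left(\frac{\gamma(b)}{p(b-1)}\right)^{u(b)} \times \gamma^{-u(1)}\,1_{\{u(1) = k - u(2) - \dots - u(A)\}}$$

and (28), we derive

$$c''_k = \binom{k}{2}\frac{1}{q(1)\gamma^{2}}\sum_{a=1}^{A}\frac{V(a)p(a)}{m(a)} \sum_{\mathbf{v}: v=k-1} \binom{k-2}{v(1), \dots, v(a-1), v(a)-1, v(a+1), \dots, v(A)}\prod_{b \ne a}\gamma(b)^{v(b)}\,\gamma(a)^{v(a)-1}.$$

This together with (29) yields

$$c''_k = \binom{k}{2}\frac{1}{q(1)\gamma^{2}}\sum_{a=1}^{A}\frac{V(a)}{m(a)}\,p(a).$$

We conclude that $c_k = \binom{k}{2}\lambda$, with

$$\lambda = \frac{2}{q(1)\gamma}\sum_{a=1}^{A} m(a)\gamma(a+1) + \frac{1}{q(1)\gamma^{2}}\sum_{a=1}^{A}\frac{V(a)}{m(a)}\,p(a).$$

In view of (4), this leads to (11).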
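The identity $c_k = \binom{k}{2}\lambda$ can be verified by direct enumeration for small $k$. The following sketch rests on labeled assumptions: the merger coefficients are taken as $C_{\mathbf{u}\mathbf{v}} = \varphi(\mathbf{a})\,\frac{u(1)}{q(1)}\sum_a [m(a)u(a+1) + a(a)V(a)/(2m(a))]$ with $\varphi$ a multinomial probability in $p$, the stationary law is multinomial in $\gamma(a)$, and the numbers `p`, `q`, `V` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def multinomial(counts, probs):
    """Multinomial probability of `counts` under cell probabilities `probs`."""
    coef = factorial(sum(counts))
    value = Fraction(1)
    for c, pr in zip(counts, probs):
        coef //= factorial(c)
        value *= pr ** c
    return coef * value

def states(k, A):
    """All age configurations of k lines over A age classes."""
    return [u for u in product(range(k + 1), repeat=A) if sum(u) == k]

def C(u, v, p, q1, m, V):
    """Assumed form of the one-merger coefficient for the move u -> v."""
    A = len(u)
    nxt = [u[b + 1] if b + 1 < A else 0 for b in range(A)]
    a = [v[b] - nxt[b] for b in range(A)]
    if min(a) < 0 or sum(a) != u[0] - 1:
        return Fraction(0)
    bracket = sum(m[b] * nxt[b] + a[b] * V[b] / (2 * m[b]) for b in range(A))
    return multinomial(a, p) * u[0] / q1 * bracket

q = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]   # hypothetical q(a)
p = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]   # hypothetical p(a)
V = [Fraction(1, 4), Fraction(2), Fraction(1)]          # hypothetical V(a)
A = len(p)
m = [pa * q[0] / qa for pa, qa in zip(p, q)]            # m(a) = p(a) q(1) / q(a)
gamma = sum(a * pa for a, pa in enumerate(p, start=1))
g = [sum(p[a:]) / gamma for a in range(A)] + [0]        # gamma(A+1) = 0

lam = (2 / (q[0] * gamma) * sum(m[a] * g[a + 1] for a in range(A))
       + 1 / (q[0] * gamma ** 2) * sum(V[a] * p[a] / m[a] for a in range(A)))

def c(k):
    """Coalescence rate at level k: stationary average of the row sums of C."""
    return sum(multinomial(u, g[:A]) * C(u, v, p, q[0], m, V)
               for u in states(k, A) for v in states(k - 1, A))
```

Evaluating `c(k)` for $k = 2, 3, 4$ and comparing with `comb(k, 2) * lam` gives exact agreement under these assumptions.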